The minimal deterministic finite automaton is generally used to determine the equality of regular
languages. Antimirov and Mosses proposed a rewrite system for deciding the equivalence of regular
expressions, of which Almeida et al. presented an improved variant. Hopcroft and Karp proposed an almost
linear algorithm for testing the equivalence of two deterministic finite automata that avoids
minimisation. In this paper we improve the best-case running time, present an extension of this algorithm
to non-deterministic finite automata, and establish a relationship between this algorithm and the
one proposed in Almeida et al. We also present some experimental comparative results. All these
algorithms are closely related to the recent coalgebraic approach to automata proposed by Rutten.

1 Introduction 

The uniqueness of the minimal deterministic finite automaton for each regular language is in general
used for determining the equality of regular languages. Whether the languages are represented by
deterministic finite automata (DFA), non-deterministic finite automata (NFA), or regular expressions
(r.e.), the usual procedure uses the equivalent minimal DFA to decide equivalence. The best known
algorithm, in terms of worst-case analysis, for DFA minimisation is loglinear [9], and the equivalence
problem is PSPACE-complete for both NFA and r.e. Based on the algebraic properties of regular
expressions, Antimirov and Mosses proposed a terminating and complete rewrite system for deciding their
equivalence [6]. In a paper about testing the equivalence of regular expressions, Almeida et al. [3]
presented an improved variant of this rewrite system. As suggested by Antimirov and Mosses, and
corroborated by further experimental results, a better average-case performance may be obtained.

Hopcroft and Karp [10] presented, in 1971, an almost linear algorithm for testing the equivalence 
of two DFAs that avoids their minimisation. Considering the merge of the two DFAs as a single one, 
the algorithm computes the finest right-invariant relation which identifies the initial states. The state 
equivalence relation that determines the minimal DFA is the coarsest relation in that condition. 

We present some variants of Hopcroft and Karp's algorithm (HK) (Section 3), and establish a
relationship with the one proposed in Almeida et al. (Section 4). In particular, we extend the HK
algorithm to NFAs and present some experimental comparative results (Section 5).

All these algorithms are also closely related with the recent coalgebraic approach to automata devel- 
oped by Rutten fl31 . where the notion of bisimulation corresponds to a right-invariance. Two automata 
are bisimilar if there exists a bisimulation between them. For deterministic (finite) automata, the coin- 
duction proof principle is effective for equivalence, i.e., two automata are bisimilar if and only if they 
are equivalent. Both Hopcroft and Karp's algorithm and Antimirov and Mosses' method can be seen as
instances of this more general approach (cf. Corollary 10). This means that these methods may be easily
extended to other Kleene algebras, namely the ones that model program properties, and that have been
successfully applied in formal program verification [12].

2 Preliminaries 

We recall here the basic definitions needed throughout the paper. For further details we refer the reader
to the works of Hopcroft et al. [11] and Kozen [13].

A regular expression (r.e.) $\alpha$ over an alphabet $\Sigma$ represents a (regular) language
$L(\alpha) \subseteq \Sigma^*$ and is inductively defined by: $\emptyset$ is a r.e. and
$L(\emptyset) = \emptyset$; $\varepsilon$ is a r.e. and $L(\varepsilon) = \{\varepsilon\}$; $a \in \Sigma$
is a r.e. and $L(a) = \{a\}$; if $\alpha_1$ and $\alpha_2$ are r.e., $(\alpha_1 + \alpha_2)$,
$(\alpha_1\alpha_2)$ and $(\alpha_1)^*$ are r.e., respectively with
$L((\alpha_1 + \alpha_2)) = L(\alpha_1) \cup L(\alpha_2)$, $L((\alpha_1\alpha_2)) = L(\alpha_1)L(\alpha_2)$
and $L((\alpha_1)^*) = L(\alpha_1)^*$. We define $\varepsilon(\alpha) = \varepsilon$ (resp.
$\varepsilon(\alpha) = \emptyset$) if $\varepsilon \in L(\alpha)$ (resp. $\varepsilon \notin L(\alpha)$).
Two r.e. $\alpha$ and $\beta$ are equivalent, and we write $\alpha \sim \beta$, if
$L(\alpha) = L(\beta)$. The algebraic structure $(RE, +, \cdot, \emptyset, \varepsilon)$, where $RE$
denotes the set of r.e. over $\Sigma$, constitutes an idempotent semiring, and, with the unary operator
$*$, a Kleene algebra. There are several well-known complete axiomatizations of Kleene algebras. Let ACI
denote the associativity, commutativity and idempotence of $+$.

A nondeterministic finite automaton (NFA) $A$ is a tuple $(Q, \Sigma, \delta, I, F)$ where $Q$ is a
finite set of states, $\Sigma$ is the alphabet, $\delta \subseteq Q \times \Sigma \times Q$ the transition
relation, $I \subseteq Q$ the set of initial states, and $F \subseteq Q$ the set of final states. An NFA
is deterministic (DFA) if for each pair $(q, a) \in Q \times \Sigma$ there exists at most one $q'$ such
that $(q, a, q') \in \delta$. The size of an NFA is $|Q|$. For $q \in Q$ and $a \in \Sigma$, we denote
$\delta(q, a) = \{p \mid (q, a, p) \in \delta\}$, and we can extend this notation to $x \in \Sigma^*$,
and to $R \subseteq Q$. For a DFA, we consider $\delta : Q \times \Sigma^* \to Q$. The language accepted
by $A$ is $L(A) = \{x \in \Sigma^* \mid \delta(I, x) \cap F \neq \emptyset\}$. Two NFAs $A$ and $B$ are
equivalent, denoted by $A \sim B$, if they accept the same language. Given an NFA
$A = (Q_N, \Sigma, \delta_N, I, F_N)$, we can use the powerset construction to obtain a DFA
$D = (Q_D, \Sigma, \delta_D, q_0, F_D)$ equivalent to $A$, where $Q_D = 2^{Q_N}$, $q_0 = I$, for all
$R \in Q_D$, $R \in F_D$ if and only if $R \cap F_N \neq \emptyset$, and for all $a \in \Sigma$,
$\delta_D(R, a) = \bigcup_{q \in R} \delta_N(q, a)$. This construction can be optimised by omitting
states $R \in Q_D$ that are unreachable from the initial state.
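
The optimised construction can be sketched as follows. This is only a minimal illustration, assuming the transition relation is stored as a dictionary from (state, symbol) pairs to sets of states; the function name `determinize` and the example automaton are hypothetical and not taken from the paper.

```python
from collections import deque


def determinize(delta, initial, finals, alphabet):
    """Powerset construction restricted to the reachable subsets.

    delta maps (state, symbol) to a set of successor states; a missing
    key means that there is no transition (illustrative representation).
    """
    finals = frozenset(finals)
    start = frozenset(initial)
    dfa_delta = {}
    seen = {start}
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        for a in alphabet:
            # delta_D(R, a) is the union of delta_N(q, a) for q in R
            target = frozenset(p for q in subset for p in delta.get((q, a), ()))
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # R is final if and only if it intersects the final states of the NFA
    dfa_finals = {r for r in seen if r & finals}
    return seen, dfa_delta, start, dfa_finals


# NFA for (a+b)*a(a+b): state 0 guesses the last-but-one symbol.
nfa_delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "a"): {2}, (1, "b"): {2}}
dfa_states, dfa_delta, dfa_start, dfa_finals = determinize(nfa_delta, {0}, {2}, "ab")
```

Only the subsets actually reached from the initial subset are ever built, which is the optimisation mentioned above.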

Given a finite automaton $(Q, \Sigma, \delta, q_0, F)$, let $\varepsilon(q) = 1$ if $q \in F$ and
$\varepsilon(q) = 0$ otherwise. We call a set of states $R \subseteq Q$ homogeneous if for every
$p, q \in R$, $\varepsilon(p) = \varepsilon(q)$. A DFA is minimal if there is no equivalent DFA with fewer
states. Two states $q_1, q_2 \in Q$ are said to be equivalent, denoted $q_1 \sim q_2$, if for every
$w \in \Sigma^*$, $\varepsilon(\delta(q_1, w)) = \varepsilon(\delta(q_2, w))$. Minimal DFAs are unique
up to isomorphism. Given a DFA $D$, the equivalent minimal DFA $D/{\sim}$ is called the quotient
automaton of $D$ by the equivalence relation $\sim$. The state equivalence relation $\sim$ is a special
case of a right-invariant equivalence relation w.r.t. $D$, i.e., a relation
$\equiv \subseteq Q \times Q$ such that all classes of $\equiv$ are homogeneous, and for any
$p, q \in Q$, $a \in \Sigma$, if $p \equiv q$, then $\delta(p, a)/{\equiv} = \delta(q, a)/{\equiv}$,
where for any set $S$, $S/{\equiv} = \{[s] \mid s \in S\}$. Finally, we recall that every equivalence
relation $\equiv$ over a set $S$ is efficiently represented by the partition of $S$ given by
$S/{\equiv}$. Given two equivalence relations over a set $S$, $\equiv_R$ and $\equiv_T$, we say that
$\equiv_R$ is finer than $\equiv_T$ (and $\equiv_T$ coarser than $\equiv_R$) if and only if
$\equiv_R \subseteq \equiv_T$.

3 Testing finite automata equivalence 

The classical approach to the comparison of DFAs relies on the construction of the minimal equivalent
DFA. The best known algorithm for this procedure runs in $O(kn \log n)$ time [9], for a DFA with $n$
states over an alphabet of $k$ symbols. Hopcroft and Karp [10] proposed an algorithm for testing the
equivalence of two DFAs that makes use of an almost $O(n)$ set merging method.

3.1 The original Hopcroft and Karp algorithm

Let $A = (Q_1, \Sigma, p_0, \delta_1, F_1)$ and $B = (Q_2, \Sigma, q_0, \delta_2, F_2)$ be two DFAs, with
$|Q_1| = n$, $|Q_2| = m$, and such that $Q_1$ and $Q_2$ are disjoint. In order to simplify notation, we
assume $Q = Q_1 \cup Q_2$, $F = F_1 \cup F_2$, and $\delta(p, a) = \delta_i(p, a)$ for $p \in Q_i$. We
begin by presenting the original algorithm by Hopcroft and Karp [10] for testing the equivalence of two
DFAs as Algorithm 1.

If $A$ and $B$ are equivalent DFAs, the algorithm computes the finest right-invariant equivalence
relation over $Q$ that identifies the initial states, $p_0$ and $q_0$. The associated set partition is
built using the UNION-FIND method. This algorithm assumes disjoint sets and defines the three functions
which follow.

• MAKE(i): creates a new set (singleton) for one element $i$ (the identifier);

• FIND(i): returns the identifier $S_i$ of the set which contains $i$;

• UNION(i, j, k): combines the sets identified by $i$ and $j$ in a new set $S_k = S_i \cup S_j$; $S_i$
and $S_j$ are destroyed.
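
A minimal sketch of these three operations is given below, assuming the common forest representation with union by rank and path compression. The class name `UnionFind` is hypothetical, and, unlike the three-argument UNION above, this `union` chooses the identifier of the merged set itself and returns it.

```python
class UnionFind:
    """Disjoint sets stored as a forest (illustrative sketch)."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make(self, i):
        # MAKE(i): a new singleton whose identifier is i
        self.parent[i] = i
        self.rank[i] = 0

    def find(self, i):
        # FIND(i): walk up to the root, then compress the path
        root = i
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[i] != root:
            self.parent[i], i = root, self.parent[i]
        return root

    def union(self, i, j):
        # UNION: attach the shallower tree below the deeper one
        ri, rj = self.find(i), self.find(j)
        if ri == rj:
            return ri
        if self.rank[ri] < self.rank[rj]:
            ri, rj = rj, ri
        self.parent[rj] = ri
        if self.rank[ri] == self.rank[rj]:
            self.rank[ri] += 1
        return ri


uf = UnionFind()
for state in ["p0", "p1", "q0", "q1"]:
    uf.make(state)
uf.union("p0", "q0")
uf.union("p1", "q1")
```

After these calls, `p0` and `q0` share an identifier, as do `p1` and `q1`, while the two sets remain distinct.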

It is clear that, disregarding the set operations, the worst-case time of the algorithm is $O(k(n + m))$,
where $k = |\Sigma|$. An arbitrary sequence of $i$ MAKE, UNION, and FIND operations, $j$ of which are MAKE
operations in order to create the required sets, can be performed in worst-case time $O(i\,\alpha(j))$,
where $\alpha(j)$ is related to a functional inverse of the Ackermann function, and, as such, grows very
slowly. In fact, for every practical value of $j$ (up to $2^{2^{16}}$), $\alpha(j) \leq 4$.



 1  def HK(A, B):
 2      for q ∈ Q: MAKE(q)
 3      S = ∅
 4      UNION(p0, q0, q0); PUSH(S, (p0, q0))
 5      while (p, q) = POP(S):
 6          for a ∈ Σ:
 7              p' = FIND(δ(p, a))
 8              q' = FIND(δ(q, a))
 9              if p' ≠ q':
10                  UNION(p', q', q')
11                  PUSH(S, (p', q'))
12      if ∀S_i ∀p, q ∈ S_i: ε(p) = ε(q): return True
13      else: return False

Algorithm 1: The original HK algorithm.

When applied to Algorithm 1, this set union algorithm allows for a worst-case time complexity
of $O(k(n + m) + 3i\,\alpha(j)) = O(k(n + m) + 3(n + m)\,\alpha(n + m))$. Considering $\alpha(n + m)$
constant, the asymptotic running time of the algorithm is $O(k(n + m))$. The correctness of this
algorithm is proved in Section 4, Theorem 8.
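
As an illustration, Algorithm 1 can be transcribed into runnable Python roughly as follows. This is a sketch under stated assumptions: both DFAs are complete, their states are already disjoint, the transition function is a dictionary over (state, symbol) pairs, and the union is done without ranks for brevity. The function name `hk_equivalent` and the example automata are hypothetical.

```python
def hk_equivalent(states, alphabet, delta, finals, p0, q0):
    """Sketch of Algorithm 1 on the disjoint union of two complete DFAs.

    delta maps (state, symbol) to a state and finals is the set of final
    states of both automata (illustrative representation).
    """
    parent = {q: q for q in states}           # MAKE(q) for every q in Q

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]     # path halving
            x = parent[x]
        return x

    parent[find(p0)] = find(q0)               # UNION(p0, q0, q0)
    stack = [(p0, q0)]
    while stack:
        p, q = stack.pop()
        for a in alphabet:
            p1, q1 = find(delta[p, a]), find(delta[q, a])
            if p1 != q1:
                parent[p1] = q1               # UNION(p', q', q')
                stack.append((p1, q1))
    # Lines 12-13: every class of the partition must be homogeneous.
    kind = {}
    for q in states:
        if kind.setdefault(find(q), q in finals) != (q in finals):
            return False
    return True


# Two DFAs over {a} accepting the words of even length, and one accepting every word.
delta = {("a0", "a"): "a1", ("a1", "a"): "a0",
         ("b0", "a"): "b1", ("b1", "a"): "b0",
         ("c0", "a"): "c0"}
states = ["a0", "a1", "b0", "b1", "c0"]
finals = {"a0", "b0", "c0"}
same = hk_equivalent(states, "a", delta, finals, "a0", "b0")
different = hk_equivalent(states, "a", delta, finals, "a0", "c0")
```

In the second call the class containing `a0`, `a1` and `c0` mixes final and non-final states, so the final homogeneity check fails.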

3.2 Improved best-case running time 

By altering the FIND function in order to create the set being looked for if it does not exist, i.e.,
whenever FIND(i) fails, MAKE(i) is called and the set $S_i = \{i\}$ is created, we may add a refutation
procedure earlier in the algorithm. This allows the algorithm to return as soon as it finds a pair of
states such that one is final and the other is not. This alteration to the FIND procedure avoids the
initialization of $m + n$ sets which may never actually be used. These modifications to Algorithm 1 are
presented in Algorithm 2.

Although it does not change the worst-case complexity, the best-case analysis is considerably better,
as it goes from $\Omega(k(n + m))$ to $\Omega(1)$. Not only is it possible to distinguish the automata by
the first pair of states, but it is also possible to avoid the linear check in lines 12-13. The observed
asymptotic
behaviour of minimality of initially connected DFAs (ICDFAs) 0, suggests that, when dealing with 
random DFAs, the probability of having two equivalent automata is very low, and a refutation method 
will be very useful (see Section[5]). 

Lemma 1. In line 5 of Algorithm 1, all the sets $S_i$ are homogeneous if and only if all the pairs of
states $(p, q)$ pushed into the stack are such that $\varepsilon(p) = \varepsilon(q)$.

Proof: Let us proceed by induction on the number $l$ of times line 5 is executed. If $l = 1$, it is
trivial. Suppose that the lemma is true for the $l$th time the algorithm executes line 5. If, for all
$a \in \Sigma$, the condition in line 9 is false, then for the $(l + 1)$th time the homogeneous character
of the sets remains unaltered. Otherwise, it is clear that in lines 10-11, $S_{p'} \cup S_{q'}$ is
homogeneous if and only if $\varepsilon(p') = \varepsilon(q')$. Thus the lemma is true. □



 1  def HKi(A, B):
 2      MAKE(p0); MAKE(q0)
 3      S = ∅
 4      UNION(p0, q0, q0); PUSH(S, (p0, q0))
 5      while (p, q) = POP(S):
 6          if ε(p) ≠ ε(q): return False
 7          for a ∈ Σ:
 8              p' = FIND(δ(p, a))
 9              q' = FIND(δ(q, a))
10              if p' ≠ q':
11                  UNION(p', q', q')
12                  PUSH(S, (p', q'))
13      return True

Algorithm 2: HK algorithm with an early refutation step (HKi).

Theorem 2. Algorithms 1 (HK) and 2 (HKi) are equivalent.

Proof: By Lemma 1, if there is a pair of states $(p, q)$ pushed into the stack such that
$\varepsilon(p) \neq \varepsilon(q)$, then the algorithm can terminate and return False. That is exactly
what Algorithm 2 does. □
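
A runnable sketch of Algorithm 2 is shown below, under the same assumptions as before (complete DFAs with disjoint state sets, dictionary-based transitions). The lazy `find`, which creates a singleton the first time a state is seen, plays the role of the altered FIND; the name `hk_refute` and the example data are hypothetical.

```python
def hk_refute(alphabet, delta, finals, p0, q0):
    """Sketch of Algorithm 2 (HKi): lazy MAKE plus early refutation."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)               # FIND creates {x} if it is missing
        while parent[x] != x:
            parent[x] = parent[parent[x]]     # path halving
            x = parent[x]
        return x

    parent[find(p0)] = find(q0)               # UNION(p0, q0, q0)
    stack = [(p0, q0)]
    while stack:
        p, q = stack.pop()
        if (p in finals) != (q in finals):    # refutation step (line 6)
            return False
        for a in alphabet:
            p1, q1 = find(delta[p, a]), find(delta[q, a])
            if p1 != q1:
                parent[p1] = q1               # UNION(p', q', q')
                stack.append((p1, q1))
    return True


# Words of even length (twice) versus all words, over the alphabet {a}.
delta = {("a0", "a"): "a1", ("a1", "a"): "a0",
         ("b0", "a"): "b1", ("b1", "a"): "b0",
         ("c0", "a"): "c0"}
finals = {"a0", "b0", "c0"}
same = hk_refute("a", delta, finals, "a0", "b0")
different = hk_refute("a", delta, finals, "a0", "c0")
```

Note that no set is created for a state that is never reached, and the second call stops as soon as the pair (`a1`, `c0`) is popped.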

3.3 Testing NFA equivalence 

It is possible to extend Algorithm 2 to test the equivalence of NFAs. The basic idea is to embed the
powerset construction into the algorithm, although this must be done with some caution. Because of
space limitations, we will only sketch this extension. We call this algorithm HKe.

Let $N_1 = (Q_1, \Sigma, \delta_1, I_1, F_1)$ and $N_2 = (Q_2, \Sigma, \delta_2, I_2, F_2)$ be two NFAs.
We assume that $Q_1$ and $Q_2$ are disjoint, and we make $Q_N = Q_1 \cup Q_2$, $F_N = F_1 \cup F_2$, and
$\delta_N(p, a) = \delta_i(p, a)$ for $p \in Q_i$. Consider Algorithm 2 with the following data:
$q_0 = I_1$, $p_0 = I_2$, and for $p \in 2^{Q_N}$, $\delta(p, a) = \bigcup_{q \in p} \delta_N(q, a)$ and
$\varepsilon(p) = 1$ if and only if $\exists q \in p : \varepsilon(q) = 1$. Notice that when dealing with
NFAs it is essential to use the idea described in Subsection 3.2 and to adjust the FIND operation so that
FIND(i) creates the set $S_i$ if it does not exist. This way we avoid calling MAKE for each of the
$2^{|Q_N|}$ sets, which would lead directly to the worst case of the powerset construction.

Theorem 3. Algorithm 2 can be applied to NFAs by embedding the powerset construction method.

As any DFA is a particular case of an NFA, all the experimental results presented in Section 5 use
Algorithm HKe, whether the finite automata being tested are deterministic or not.
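
The extension can be sketched in Python as follows. This is not the authors' implementation, only an illustration of the idea: subsets are represented as `frozenset` objects, created on demand, and used directly as the elements handled by the lazy FIND. The name `nfa_equivalent` and the example automata are hypothetical.

```python
def nfa_equivalent(alphabet, delta, finals, init1, init2):
    """Sketch of HKe: Algorithm 2 over subsets built on the fly.

    delta maps (state, symbol) to a set of states of the disjoint union
    of both NFAs; a missing key means no transition (illustrative).
    """
    def step(subset, a):
        return frozenset(t for s in subset for t in delta.get((s, a), ()))

    def is_final(subset):
        return any(s in finals for s in subset)

    parent = {}

    def find(x):
        parent.setdefault(x, x)               # only reached subsets are created
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    p0, q0 = frozenset(init1), frozenset(init2)
    parent[find(p0)] = find(q0)
    stack = [(p0, q0)]
    while stack:
        p, q = stack.pop()
        if is_final(p) != is_final(q):
            return False
        for a in alphabet:
            p1, q1 = find(step(p, a)), find(step(q, a))
            if p1 != q1:
                parent[p1] = q1
                stack.append((p1, q1))
    return True


delta = {
    # NFA for (a+b)*a
    (0, "a"): {0, 1}, (0, "b"): {0},
    # DFA for (a+b)*a
    ("x", "a"): {"y"}, ("x", "b"): {"x"}, ("y", "a"): {"y"}, ("y", "b"): {"x"},
    # DFA for (a+b)*b
    ("u", "a"): {"u"}, ("u", "b"): {"v"}, ("v", "a"): {"u"}, ("v", "b"): {"v"},
}
finals = {1, "y", "v"}
same = nfa_equivalent("ab", delta, finals, {0}, {"x"})
different = nfa_equivalent("ab", delta, finals, {0}, {"u"})
```

Because a DFA is just an NFA whose transition sets are singletons, the same function handles both kinds of input.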

4 Relationship with Antimirov and Mosses' method 
4.1 Antimirov and Mosses' algorithm 

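The derivative [8] of a r.e. $\alpha$ with respect to a symbol $a \in \Sigma$, denoted
$a^{-1}(\alpha)$, is defined recursively on the structure of $\alpha$ as follows:

$$
\begin{aligned}
a^{-1}(\emptyset) &= \emptyset; & a^{-1}(\alpha + \beta) &= a^{-1}(\alpha) + a^{-1}(\beta); \\
a^{-1}(\varepsilon) &= \emptyset; & a^{-1}(\alpha\beta) &= a^{-1}(\alpha)\beta + \varepsilon(\alpha)\,a^{-1}(\beta); \\
a^{-1}(b) &= \begin{cases} \varepsilon, & \text{if } b = a; \\ \emptyset, & \text{otherwise;} \end{cases} & a^{-1}(\alpha^*) &= a^{-1}(\alpha)\,\alpha^*.
\end{aligned}
$$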

This notion can be trivially extended to words, and considering r.e. modulo the ACI axioms, Brzozowski [8]
proved that the set of derivatives of a r.e. $\alpha$, $\mathcal{D}(\alpha)$, is finite. This result leads
to the definition of Brzozowski's automaton which is equivalent to a given r.e. $\alpha$:
$D_\alpha = (\mathcal{D}(\alpha), \Sigma, \delta_\alpha, \alpha, F_\alpha)$ where
$F_\alpha = \{d \in \mathcal{D}(\alpha) \mid \varepsilon(d) = \varepsilon\}$, and
$\delta_\alpha(d, a) = a^{-1}(d)$, for all $d \in \mathcal{D}(\alpha)$, $a \in \Sigma$.
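
The derivative rules can be sketched in Python as follows. The tuple encoding of r.e. and the helper names (`plus`, `cat`, `star`, `nullable`, `deriv`) are assumptions made for this illustration; the smart constructors apply only a few simplifications (identities for the empty set and the empty word, and syntactic idempotence of +), not full normalisation modulo ACI.

```python
EMPTY, EPS = ("empty",), ("eps",)


def sym(a):
    return ("sym", a)


def plus(r, s):
    if r == EMPTY:
        return s
    if s == EMPTY or r == s:                  # unit and idempotence of +
        return r
    return ("+", r, s)


def cat(r, s):
    if EMPTY in (r, s):
        return EMPTY
    if r == EPS:
        return s
    if s == EPS:
        return r
    return (".", r, s)


def star(r):
    return ("*", r)


def nullable(r):
    """True if and only if the empty word belongs to L(r)."""
    tag = r[0]
    if tag in ("eps", "*"):
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "+":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # concatenation


def deriv(a, r):
    """Derivative of r with respect to the symbol a."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "sym":
        return EPS if r[1] == a else EMPTY
    if tag == "+":
        return plus(deriv(a, r[1]), deriv(a, r[2]))
    if tag == ".":
        left = cat(deriv(a, r[1]), r[2])
        return plus(left, deriv(a, r[2])) if nullable(r[1]) else left
    return cat(deriv(a, r[1]), r)             # star rule


r = cat(star(plus(sym("a"), sym("b"))), sym("a"))   # (a+b)*a
da, db = deriv("a", r), deriv("b", r)
```

For this r.e., the derivative with respect to `b` is the expression itself, while the derivative with respect to `a` is the expression plus the empty word, so only the latter is a final state of Brzozowski's automaton.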

Antimirov and Mosses [6] proposed a rewrite system for deciding the equivalence of two extended
r.e. (with intersection), based on a complete axiomatization. This is a refutation method such that
testing the equivalence of two r.e. corresponds to an iterated process of testing the equivalence of their
derivatives. In the process, a Brzozowski's automaton is computed for each r.e. Not considering extended
r.e., Algorithm 3 is a version of AM's method, which was, essentially, the one proposed by Almeida et
al. [3].



 1  def AM(α, β):
 2      S = {(α, β)}
 3      H = ∅
 4      while (α, β) = POP(S):
 5          if ε(α) ≠ ε(β): return False
 6          PUSH(H, (α, β))
 7          for a ∈ Σ:
 8              α' = a⁻¹(α)
 9              β' = a⁻¹(β)
10              if (α', β') ∉ H: PUSH(S, (α', β'))
11      return True

Algorithm 3: A simplified version of algorithm AM.



4.2 A naive HK algorithm 

We now present a naive version of Algorithm 1. It will be useful to prove its correctness and to
establish a relationship to Antimirov and Mosses' method (AM). Let
$A = (Q_1, \Sigma, p_0, \delta_1, F_1)$ and $B = (Q_2, \Sigma, q_0, \delta_2, F_2)$ be two DFAs, with
$|Q_1| = n$ and $|Q_2| = m$, and $Q_1$ and $Q_2$ disjoint. Consider Algorithm 4. Termination is
guaranteed because the number of pairs of states pushed into $S$ is at most $mn$ and in each iteration
one pair is popped from $S$. To prove the correctness we show that in $H$ we collect the pairs of states
of the relation $R$, defined below.



 1  def HKn(A, B):
 2      S = {(p0, q0)}
 3      H = ∅
 4      while (p, q) = POP(S):
 5          PUSH(H, (p, q))
 6          for a ∈ Σ:
 7              p' = δ1(p, a)
 8              q' = δ2(q, a)
 9              if (p', q') ∉ H: PUSH(S, (p', q'))
10      for (p, q) in H:
11          if ε(p) ≠ ε(q): return False
12      return True

Algorithm 4: The algorithm HKn, a naive version of HK.

Lemma 4. In Algorithm 4, for all $(p, q) \in Q_1 \times Q_2$, $(p, q) \in S$ in a step $k > 0$ if and
only if $(p, q) \in H$ for some step $l > k$.

Definition 5. Let $R$ be defined as follows:

$$R = \{(p, q) \in Q_1 \times Q_2 \mid \exists x \in \Sigma^* : \delta_1(p_0, x) = p \wedge \delta_2(q_0, x) = q\}.$$

Lemma 6. For all $(p, q) \in Q_1 \times Q_2$, $(p, q) \in S$ at some step of Algorithm 4 if and only if
$(p, q) \in R$.

Lemma 7. In line 10, for all $(p, q) \in Q_1 \times Q_2$, $(p, q) \in R$ if and only if $(p, q) \in H$.

Considering Lemma 6 and Lemma 7, the following theorem ensures the correctness of Algorithm 4.

Theorem 8. $A \sim B$ if and only if for all $(p, q) \in R$, $\varepsilon(p) = \varepsilon(q)$.

Proof: Suppose, for the sake of contradiction, that $A$ and $B$ are not equivalent and that the condition
holds. Then, there exists $w \in \Sigma^*$ such that
$\varepsilon(\delta(p_0, w)) \neq \varepsilon(\delta(q_0, w))$. But in that case there is a contradiction
because $(\delta(p_0, w), \delta(q_0, w)) \in R$. On the other hand, if there exists a $(p, q) \in R$ such
that $\varepsilon(p) \neq \varepsilon(q)$, obviously $A$ and $B$ are not equivalent. □
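
The naive algorithm and the check of Theorem 8 can be sketched as follows. The function name `hk_naive` and the example automata are hypothetical, and the sketch adds one small guard that is not in Algorithm 4: a pair that was pushed twice before being processed is skipped when it is popped the second time.

```python
def hk_naive(alphabet, delta1, delta2, finals1, finals2, p0, q0):
    """Sketch of Algorithm 4 (HKn): collect the relation R in a history set."""
    stack = [(p0, q0)]
    history = set()
    while stack:
        pair = stack.pop()
        if pair in history:                   # extra guard, see the lead-in
            continue
        history.add(pair)
        p, q = pair
        for a in alphabet:
            successor = (delta1[p, a], delta2[q, a])
            if successor not in history:
                stack.append(successor)
    # Theorem 8: equivalent iff every reachable pair agrees on finality
    return all((p in finals1) == (q in finals2) for p, q in history)


# A and B accept the words of even length over {a}; C accepts every word.
delta_a = {("a0", "a"): "a1", ("a1", "a"): "a0"}
delta_b = {("b0", "a"): "b1", ("b1", "a"): "b0"}
delta_c = {("c0", "a"): "c0"}
same = hk_naive("a", delta_a, delta_b, {"a0"}, {"b0"}, "a0", "b0")
different = hk_naive("a", delta_a, delta_c, {"a0"}, {"c0"}, "a0", "c0")
```

In the second call the history ends up containing the pair (`a1`, `c0`), which is reached by the word `a` and pairs a non-final state with a final one.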

The relation $R$ can be seen as a relation on $(Q_1 \cup Q_2)^2$ which is reflexive and symmetric. Its
transitive closure $R^*$ is an equivalence relation.

Lemma 9. $\forall (p, q) \in R$, $\varepsilon(p) = \varepsilon(q)$ if and only if
$\forall (p, q) \in R^*$, $\varepsilon(p) = \varepsilon(q)$.

Corollary 10. $A \sim B$ if and only if $\forall (p, q) \in R^*$, $\varepsilon(p) = \varepsilon(q)$.

Algorithm HK computes $R^*$ by starting with the finest partition in $Q_1 \cup Q_2$ (the identity). And
if $A \sim B$, $R^*$ is a right-invariance.

Corollary 11. Algorithm 4 and Algorithm 1 are equivalent.

4.3 Equivalence of the two methods

Algorithm 4 can be modified into an earlier refutation version, as in Algorithm 2. In order to do so,
we remove lines 10-11, and we insert a line equal to line 6 of Algorithm 2 before line 5. It is then
obvious that Algorithm 3 corresponds to Algorithm 4 applied to Brzozowski's automata of two r.e.,
where these DFAs are incrementally constructed during the algorithm's execution. In particular, the
halting conditions are the same considering the definition of final states in a Brzozowski's automaton.

Theorem 12. Algorithm 3 (AM) corresponds to Algorithm 4 (HKn) applied to Brzozowski's automata
of two regular expressions.

4.4 Improving Algorithm AM with Union-Find 

Considering Theorem 12 and Corollary 11, we can improve Algorithm 3 (AM) for testing the equivalence of
two r.e. $\alpha$ and $\beta$, by considering Algorithm 1 applied to the Brzozowski's automata
corresponding to the two r.e. Instead of using a stack ($H$) in order to keep a history of the pairs of
regular expressions which have already been tested, we can build the corresponding equivalence relation
$R^*$ (as defined for Lemma 9). Two main changes must be considered:

• One must ensure that the sets of derivatives of each regular expression are disjoint. For that we
consider their disjoint sum, where derivatives w.r.t. a word $u$ are represented by tuples
$(u^{-1}(\alpha), 1)$ and $(u^{-1}(\beta), 2)$, respectively.

• In the UNION-FIND method, the FIND operation needs an equality test on the elements of the set.
Testing the equality of two r.e., even syntactic equality, is already a computationally expensive
operation, and tuple comparison will be even slower. On the other hand, integer comparison can
be considered to be $O(1)$. As we know that each element of the set is unique, we may consider
some hash function which assures that the probability of collision for these elements is extremely
low. This allows us to safely use the hash values as the elements of the set, and thus as arguments
to the FIND operation, instead of the r.e. themselves. This is also a natural procedure in the
implementations of conversions from r.e. to automata.

We call the resulting algorithm equivUF. The experimental results are presented in Table 3,
Section 5.

4.5 Worst-case complexity analysis 

In Almeida et al. (3 ] the algorithm AM was improved by considering partial derivatives 0. The resulting 
algorithm (equivP) can be seen as the algorithm HKe applied to the partial derivatives NFA of a r.e. We
present a lower bound for the worst-case complexity of this algorithm by exhibiting a family of r.e. for
which the comparison method can be exponential in the number of alphabetical symbols $|\alpha|_\Sigma$ of
a r.e. $\alpha$. We will proceed by showing that the partial derivatives NFA $N$ of a r.e. $\alpha$ is
such that $|N| \in O(|\alpha|_\Sigma)$ and the number of states of the smallest equivalent DFA is
exponential in $|N|$.

Figure 1 presents a classical example of a badly behaved case of the powerset construction, by Hopcroft
et al. [11]. Although this example does not reach the $2^n$ states bound, the smallest equivalent DFA has
exactly $2^{n-1}$ states.

Consider the r.e. family $\alpha_\ell = (a + b)^* a (a + b)^\ell$, where
$|\alpha_\ell|_\Sigma = 3 + 2\ell = m$. It is easy to see that the NFA in Figure 1 is obtained directly
from the application of the AM method to $\alpha_\ell$, with the corresponding partial derivatives
presented in Figure 2.

Figure 1: NFA which has no equivalent DFA with fewer than $2^n$ states.

Figure 2: NFA obtained from the r.e. $\alpha_\ell$ using the AM method.

The set of the partial derivatives

$$PD(\alpha_\ell) = \{\alpha_\ell, (a + b)^\ell, \ldots, (a + b), \varepsilon\}$$

has $\ell + 2 = \frac{m+1}{2}$ elements, which corresponds to the size of the obtained NFA. The equivalent
minimal DFA has $2^{\ell+1} = 2^{\frac{m-1}{2}}$ states.
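
The exponential growth for this family can be checked experimentally with a short sketch. It assumes the usual NFA for $(a + b)^* a (a + b)^\ell$ with states numbered 0 to $\ell + 1$ (state 0 loops on both symbols and moves to state 1 on `a`; every state from 1 to $\ell$ advances on either symbol), and it counts the subsets reachable in the powerset construction. The function name `reachable_subsets` is hypothetical.

```python
def reachable_subsets(l):
    """Count the subsets reachable from {0} for the (l + 2)-state NFA
    of (a+b)* a (a+b)^l (illustrative state numbering 0 .. l+1)."""
    def step(subset, symbol):
        out = set()
        for q in subset:
            if q == 0:
                out.add(0)                    # loop on a and b
                if symbol == "a":
                    out.add(1)                # guess: this a is the marked one
            elif q <= l:
                out.add(q + 1)                # consume one of the trailing symbols
        return frozenset(out)

    start = frozenset({0})
    seen = {start}
    todo = [start]
    while todo:
        subset = todo.pop()
        for symbol in "ab":
            target = step(subset, symbol)
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return len(seen)


counts = [reachable_subsets(l) for l in range(4)]
```

For small values of $\ell$ the count doubles with each increment, in agreement with the $2^{\ell+1}$ bound stated above.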



5 Experimental results 

In this section we present some experimental results of the previously discussed algorithms applied
to DFAs, NFAs, and r.e. We also include the results of the same tests using Hopcroft's (Hop) and
Brzozowski's (Brz) [7] automata minimization algorithms. The random DFAs were generated using
publicly available toolo2 |. The NFAs dataset was obtained with a set of tools described by Almeida 
et al. [4]. All the algorithms were implemented in the Python programming language. The tests were 
executed on the same computer, an Intel® Xeon® 5140 at 2.33GHz with 4GB of RAM.

Table 1: Running times for tests with complete accessible DFAs.

Alg.  |        n = 5, k = 2        |       n = 5, k = 50        |       n = 50, k = 2        |       n = 50, k = 50
      | Eff. (s)  Total (s)  Iter. | Eff. (s)  Total (s)  Iter. | Eff. (s)  Total (s)  Iter. | Eff. (s)  Total (s)  Iter.
Hop   |      5.3        7.3        |     85.2       91.0        |    566.8        572        |  17749.7    17787.5
Brz   |     25.5       28.0        |   1393.6     1398.9        |                            |
HK    |      2.3        4.0    8.9 |     25.3       28.9    9.0 |     23.2       28.9   98.9 |    317.5      341.6   99.0
HKe   |      0.9        2.1    2.4 |      5.4       10.5    2.4 |      1.4        5.9    2.6 |     14.3       34.9    3.4
HKs   |      0.6        1.3    2.4 |      2.8        4.6    2.4 |      0.8        2.0    2.7 |      9.1       21.3    3.4
HKn   |      0.7        2.2    3.0 |     51.5       56.2   29.7 |      1.3        6.8    3.7 |     29.4       51.7   15.4

Table 1 shows the
results of experimental tests with 10,000 pairs of complete ICDFAs. Due to space constraints, we only
present the results for automata with $n \in \{5, 50\}$ states over an alphabet of $k \in \{2, 50\}$
symbols. Clearly, the methods which do not rely on minimisation processes are a lot faster. Under (Eff.)
appears the effective time spent by the algorithm itself, while under (Total) we show the total time
spent, including overheads such as making a DFA complete, initializing auxiliary data structures, etc.
All times are expressed in seconds, and the algorithms that did not finish within 10 hours are signaled
accordingly.

http://www.ncc.up.pt/FAdo/nodel.html

The algorithm Brz is by far the slowest. The algorithm Hop, although faster, is still several orders
of magnitude slower than any of the algorithms of the previous sections. We also present the average 
number of iterations (Iter.) used by each of the versions of algorithm HK, per pair of automata. Clearly, 
the refutation process is an advantage. HKn running times show that a linear set merging algorithm 
(such as UNION-FIND) is by far a better choice than a simple history (set) with pairs of states. HKs 
is a version of HKe which uses the automata string representation proposed by Almeida et al. El [T4ll. 
The simplicity of the representation seemed to be quite suitable for this algorithm, and actually cut down 
both running times to roughly half. This is an example of the impact that a good data structure may have 
on the overall performance of this algorithm. 



Table 2: Running times for tests with 10,000 random NFAs.

Alg.  |        n = 5, k = 2        |       n = 5, k = 20        |       n = 50, k = 2        |       n = 50, k = 20
      | Eff. (s)  Total (s)  Iter. | Eff. (s)  Total (s)  Iter. | Eff. (s)  Total (s)  Iter. | Eff. (s)  Total (s)  Iter.

Transition density d = 0.1
Hop   |     10.3       12.5        |   1994.7     2003.2        |    660.1      672.9        |
Brz   |      8.4       10.6        |    866.6      876.2        |    264.5      278.4        |
HKe   |      0.8        2.9    2.2 |      8.4         19      4 |     24.4       37.8   10.2 |

Transition density d = 0.5
Hop   |     17.9       19.8        |   2759.4     2767.5        |    538.7      572.6        |
Brz   |     14.4         16        |   2189.3     2191.6        |    614.9      655.7        |
HKe   |      2.6        4.3    4.9 |     36.3       47.3   10.3 |      6.8       48.9    2.5 |    294.6      702.3   11.5

Transition density d = 0.8
Hop   |     12.5       14.3        |    376.9      385.5        |   1087.3     1134.2        |
Brz   |       14       15.8        |      177      179.6        |    957.5     1014.3        |
HKe   |      1.4        3.2    2.7 |       39       49.9   10.7 |      7.3       64.8    2.5 |    440.5      986.6   11.5

Table 2 shows the results of applying the same set of algorithms to NFAs. The testing conditions
and notation are as before, adding only the transition density $d$ as a new variable, which we define as
the ratio of the number of transitions over the total number of possible transitions ($kn^2$). Although
it is clear that HKe is faster, by at least one order of magnitude, than any of the other algorithms, the
peculiar behaviour of this algorithm with different transition densities is not easy to explain.
Considering the simplest example of 5 states and 2 symbols, the dataset with a transition density
$d = 0.5$ took roughly twice as long as those with $d \in \{0.1, 0.8\}$. At the other extreme, with
$n = 50$ and $k = 2$, the hardest instance was $d = 0.1$, while the cases where $d \in \{0.5, 0.8\}$ show
similar running times, almost five times faster. In our largest test, with $n = 50$ and $k = 20$, neither
Hop nor Brz finished within the imposed time limit. Again, $d = 0.1$ was the hardest instance for HKe,
which also did not finish within the time limit, although the cases where $d \in \{0.5, 0.8\}$ show
similar running times.
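
As a small worked example of the density definition, the following sketch computes $d$ for an NFA whose transitions are given as a set of (source, symbol, target) triples. The function name `transition_density` and the sample transitions are hypothetical.

```python
def transition_density(transitions, n, k):
    """Ratio of existing transitions to the k * n^2 possible ones."""
    return len(transitions) / (k * n * n)


# An NFA with n = 5 states over k = 2 symbols and 5 transitions.
transitions = {(0, "a", 1), (1, "b", 2), (2, "a", 3), (3, "b", 4), (4, "a", 0)}
d = transition_density(transitions, n=5, k=2)
```

With 5 transitions out of the $2 \cdot 5^2 = 50$ possible ones, the density is $5/50 = 0.1$.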

Table 3 presents the running times of the application of HKe to r.e. and their comparison with the
algorithms presented by Almeida et al. [3], where equiv and equivP are the functional variants of the
original AM algorithm. equivUF is the UNION-FIND improved version of equivP. Although the results
indicate that HKe is not as fast as the direct comparison methods presented in the cited paper, it is
clearly faster than any minimisation process. The improvements of equivUF over equivP are not significant
(it is actually considerably slower for r.e. of length 100 with 2 symbols). We suspect that this is
related to some optimizations applied by the Python interpreter. We state this based on the fact that
when both algorithms are executed using a profiler, equivUF is almost twice as fast as equivP on most
tests.

Table 3: Running times (seconds) for tests with 10,000 random r.e.

k    Size   Hop       Brz       AM       Equiv   EquivP   HKe      EquivUF
2    10     21.025    19.06     26.27    7.78    5.512    7.27     5.10
2    50     319.56    217.54    297.23   36.13   28.05    64.12    28.69
2    75     1043.13   600.14    434.89   35.79   23.46    139.12   60.09
2    100    7019.61   1729.05   970.36   60.76   48.29    183.55   124.00
5    10     42.06     25.99     32.73    9.96    7.25     8.69     6.48
5    50     518.16    156.28    205.41   33.75   26.84    67.7     21.53
5    75     943.65    267.12    292.78   35.09   25.17    161.84   28.61
5    100    1974.01   386.72    567.39   54.79   45.41    196.13   37.02
10   10     61.60     31.04     38.27    10.87   8.39     9.26     7.47
10   50     1138.28   198.97    184.93   34.93   28.95    72.95    22.60
10   75     2012.43   320.37    271.14   35.77   26.92    195.88   30.61
10   100    4689.38   460.84    424.67   52.97   44.58    194.01   39.23

We have no reason to believe that similar tests with different implementations of these algorithms
would produce a significantly different ordering of their running times from the one presented here.
However, it is important to keep in mind that these are experimental tests that greatly depend on the
hardware, data structures, and several implementation details (some of which, such as compiler
optimizations, we do not fully control).

6 Conclusions 

As minimality or equivalence for (finite) transition systems is in general intractable, right-invariant
relations (bisimulations) have been extensively studied for nondeterministic variants of these systems.
When considering deterministic systems, however, those relations provide non-trivial improvements. We
presented several variants of a method by Hopcroft and Karp for the comparison of DFAs which does not
use automata minimization. By placing a refutation condition earlier in the algorithm we may achieve
better running times in the average case. This is sustained by the experimental results presented in the
paper. We extended this algorithm to handle NFAs. Using Brzozowski's automata, we showed that a
modified version of Antimirov and Mosses' method translates directly to Hopcroft and Karp's algorithm.